Zhenhailong Wang

Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

May 28, 2026

Advancing Creative Physical Intelligence in Large Multimodal Models

May 25, 2026

OSExpert: Computer-Use Agents Learning Professional Skills via Exploration

Mar 09, 2026

Predicting Camera Pose from Perspective Descriptions for Spatial Reasoning

Feb 05, 2026

ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning

Oct 14, 2025

Multimodal Policy Internalization for Conversational Agents

Oct 10, 2025

Perception-Aware Policy Optimization for Multimodal Reasoning

Jul 08, 2025

DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs

Apr 23, 2025

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents

Mar 03, 2025

Synthia: Novel Concept Design with Affordance Composition

Feb 25, 2025